Language Report Czech
Abstract
This chapter provides basic data about Language Technology for the Czech language. After a brief introduction with general facts about the language (history, linguistic features, writing system, dialects), we touch upon Czech in the digital sphere. The main achievements in the field of NLP are presented: important datasets (corpora, treebanks, lexicons etc.) and tools (morphological analyzers, taggers, automatic translators, voice recognisers and generators, keyword extractors etc.).
Similar resources
Inter-Annotator Agreement on Spontaneous Czech Language
The goal of this article is to show that for some tasks in automatic speech recognition (ASR), especially for recognition of spontaneous telephony speech, the reference annotation differs substantially among human annotators and thus sets the upper bound of the ASR accuracy. In this paper, we focus on the evaluation of the inter-annotator agreement (IAA) and ASR accuracy in the context of imper...
Portable Language Technology: Russian via Czech
We report on morphological tagging of Russian using very limited Russian resources. We train the TnT tagger (Brants, 2000) on a modified Czech corpus to get the transition probabilities. We believe that the two languages are similar enough for the transitional information to be useful. The Russian emission symbols are obtained using a morphological analyzer that does not rely on a manually crea...
Exploiting Linguistic Knowledge in Language Modeling of Czech Spontaneous Speech
In our paper, we present a method for incorporating available linguistic information into a statistical language model that is used in an ASR system for transcribing spontaneous speech. We employ the class-based language model paradigm and use the morphological tags as the basis for word-to-class mapping. Since the number of different tags is at least one order of magnitude lower than the numb...
Maximum Entropy Named Entity Recognition for Czech Language
Named Entity Recognition (NER) is an important preprocessing tool for many Natural Language Processing tasks such as Information Retrieval, Question Answering or Machine Translation. This paper focuses on NER for the Czech language. The proposed NER system is based on knowledge and experience acquired on other languages and adapted for Czech. Our recognizer outperforms the previously introduced recognize...
Czech language database of car speech and environmental noise
This paper presents a new Czech-language two-channel (stereo) speech database recorded in a car environment. The database was designed for experiments with speech enhancement for communication purposes and for the study and design of robust speech recognition systems. It reflects the car noise environment, which is currently of great interest. Tools for automated phoneme labell...
Journal
Journal title: Cognitive technologies
Year: 2023
ISSN: 2197-6635, 1611-2482
DOI: https://doi.org/10.1007/978-3-031-28819-7_10